- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Alaska (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Data Science (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Machine Learning Guided Optimal Transmission Switching to Mitigate Wildfire Ignition Risk
Huang, Weimin, Piansky, Ryan, Dilkina, Bistra, Molzahn, Daniel K.
To mitigate acute wildfire ignition risks, utilities de-energize power lines in high-risk areas. The Optimal Power Shut-off (OPS) problem optimizes line energization statuses to manage wildfire ignition risks through de-energizations while reducing load shedding. OPS problems are computationally challenging Mixed-Integer Linear Programs (MILPs) that must be solved rapidly and frequently in operational settings. For a particular power system, OPS instances share a common structure, with varying parameters related to wildfire risks, loads, and renewable generation. This motivates the use of Machine Learning (ML) to solve OPS problems by exploiting shared patterns across instances. In this paper, we develop an ML-guided framework that quickly produces high-quality de-energization decisions by extending existing ML-guided MILP solution methods while integrating domain knowledge on the number of energized and de-energized lines. Results on a large-scale realistic California-based synthetic test system show that the proposed ML-guided method produces high-quality solutions faster than traditional optimization methods.
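To make the variable-fixing idea concrete, here is a minimal, hypothetical sketch in Python with PuLP. The toy network, load values, risk scores, ML confidence thresholds, and cardinality bound are all illustrative assumptions, not the paper's model or data: high-confidence ML predictions fix a subset of line-status binaries, and the solver handles the rest under a domain-knowledge bound on the number of energized lines.

```python
# Hypothetical sketch of ML-guided variable fixing for an OPS-style MILP
# (toy network; not the paper's model). Requires: pip install pulp
import pulp

risk = {1: 0.9, 2: 0.1, 3: 0.4, 4: 0.05, 5: 0.7}        # wildfire risk per line
load_value = {"a": 10, "b": 6, "c": 8}                   # value of serving each load
feeds = {"a": [1, 2], "b": [3], "c": [4, 5]}             # load served iff all its lines are on
ml_prob = {1: 0.02, 2: 0.97, 3: 0.55, 4: 0.99, 5: 0.10}  # assumed ML energization scores

prob = pulp.LpProblem("ops_sketch", pulp.LpMaximize)
z = {i: pulp.LpVariable(f"z_{i}", cat="Binary") for i in risk}        # line energized?
s = {d: pulp.LpVariable(f"s_{d}", cat="Binary") for d in load_value}  # load served?

# objective: served load minus wildfire risk incurred by energized lines
prob += (pulp.lpSum(load_value[d] * s[d] for d in load_value)
         - pulp.lpSum(risk[i] * z[i] for i in risk))
for d, lines in feeds.items():
    for i in lines:
        prob += s[d] <= z[i]          # a load needs all of its feeding lines

prob += pulp.lpSum(z.values()) <= 4   # domain knowledge: cap on energized lines

# fix only high-confidence ML predictions; leave ambiguous lines to the solver
for i, p in ml_prob.items():
    if p >= 0.95:
        prob += z[i] == 1
    elif p <= 0.05:
        prob += z[i] == 0

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({i: int(z[i].value()) for i in risk}, pulp.value(prob.objective))
```

Fixing only the confidently predicted binaries shrinks the search space while leaving ambiguous lines to exact optimization, which is the general mechanism by which ML-guided MILP methods trade completeness of the search for speed.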
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Hawaii (0.04)
- North America > United States > Wisconsin > Brown County > Howard (0.04)
- (2 more...)
- Government > Regional Government > North America Government > United States Government (0.69)
- Energy > Power Industry > Utilities (0.48)
Convergence Analysis of SGD under Expected Smoothness
Kawamoto, Yuta, Iiduka, Hideaki
Stochastic gradient descent (SGD) is the workhorse of large-scale learning, yet classical analyses rely on assumptions that can be either too strong (bounded variance) or too coarse (uniform noise). The expected smoothness (ES) condition has emerged as a flexible alternative that ties the second moment of stochastic gradients to the objective value and the full gradient. This paper presents a self-contained convergence analysis of SGD under ES. We (i) refine ES with interpretations and sampling-dependent constants; (ii) derive bounds on the expected squared norm of the full gradient; and (iii) prove $O(1/K)$ rates with explicit residual errors for various step-size schedules. All proofs are given in full detail in the appendix. Our treatment unifies and extends recent threads (Khaled and Richtárik, 2020; Umeda and Iiduka, 2025).
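For reference, the ES condition the abstract builds on, in the standard form of Khaled and Richtárik (2020), reads: there exist constants $A, B, C \ge 0$, determined by the sampling scheme, such that $\mathbb{E}_{\xi}\big[\|\nabla f_{\xi}(x)\|^2\big] \le 2A\,\big(f(x) - f^{\inf}\big) + B\,\|\nabla f(x)\|^2 + C$ for all $x$; the paper's refined, sampling-dependent constants specialize $A$, $B$, and $C$.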
Extragradient Method for $(L_0, L_1)$-Lipschitz Root-finding Problems
Choudhury, Sayantan, Loizou, Nicolas
Introduced by Korpelevich in 1976, the extragradient method (EG) has become a cornerstone technique for solving min-max optimization, root-finding problems, and variational inequalities (VIs). Despite its longstanding presence and significant attention within the optimization community, most works studying its convergence guarantees assume the strong $L$-Lipschitz condition. In this work, building on the assumptions proposed by Zhang et al. [2024b] for minimization and Vankov et al. [2024] for VIs, we focus on the more relaxed $α$-symmetric $(L_0, L_1)$-Lipschitz condition. This condition generalizes the standard Lipschitz assumption by allowing the Lipschitz constant to scale with the operator norm, providing a more refined characterization of problem structures in modern machine learning. Under the $α$-symmetric $(L_0, L_1)$-Lipschitz condition, we propose a novel step size strategy for EG to solve root-finding problems and establish sublinear convergence rates for monotone operators and linear convergence rates for strongly monotone operators. Additionally, we prove local convergence guarantees for weak Minty operators. We supplement our analysis with experiments validating our theory and demonstrating the effectiveness and robustness of the proposed step sizes for EG.
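A minimal numpy sketch of the EG iteration on a strongly monotone linear operator follows; the norm-adaptive rule $\gamma_k = 1/(2(L_0 + L_1\|F(x_k)\|))$ is an illustrative guess motivated by the $(L_0, L_1)$ condition, not the paper's step size strategy.

```python
# Minimal extragradient sketch for root-finding F(x*) = 0 with a strongly
# monotone linear operator; the norm-adaptive step size is an illustrative
# guess motivated by the (L0, L1) condition, not the paper's exact rule.
import numpy as np

A = np.array([[1.0, 2.0], [-2.0, 1.0]])  # symmetric part = I => strongly monotone
F = lambda x: A @ x                      # unique root at x* = 0

L0, L1 = np.linalg.norm(A, 2), 0.5       # assumed (L0, L1) constants
x = np.array([5.0, -3.0])
for _ in range(200):
    gamma = 1.0 / (2.0 * (L0 + L1 * np.linalg.norm(F(x))))  # shrinks as ||F(x)|| grows
    x_half = x - gamma * F(x)            # extrapolation step
    x = x - gamma * F(x_half)            # update uses the operator at the midpoint
print(np.linalg.norm(F(x)))              # ~0 (machine precision): linear convergence
```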
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
GRPO-$λ$: Credit Assignment improves LLM Reasoning
Parthasarathi, Prasanna, Reymond, Mathieu, Chen, Boxing, Cui, Yufei, Chandar, Sarath
Large language models (LLMs) are increasingly deployed for tasks requiring complex reasoning, prompting significant interest in improving their reasoning abilities through post-training. In particular, RL-based methods using verifiable rewards, such as the state-of-the-art GRPO, have been shown to substantially improve reasoning behaviors when applied as post-training methods. However, the lack of an explicit reward or critic model limits GRPO's ability to assign fine-grained credit across token sequences. In this work, we present GRPO-$λ$, a novel extension to GRPO that enhances credit assignment in RL finetuning of LLMs for complex reasoning tasks. We approximate learning from the $λ$-return with a reformulation of eligibility traces using token-level log-probabilities applied after each sequence generation, together with a novel critic-free approximation of the temporal-difference error. We introduce several variations for weighting the $λ$-return and applying it to the eligibility trace, all of which provide significant gains over GRPO. We compare GRPO-$λ$ against GRPO by training models from 1.5B to 7B parameters on $4$ different math reasoning datasets. The training curves show 30-40% improved performance during RL training on both LLaMA-3.1 and Qwen-2.5 architectures. Finally, we show that with GRPO-$λ$, the average performance on AIME24, Math500, OlympiadMath, MinervaMath, and AMC improves over GRPO by more than $3$ points, with a $4.5$-point improvement on the 7B model.
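To illustrate why $λ$-weighted credit differs from GRPO's uniform sequence-level advantage, here is a generic eligibility-trace computation with a single terminal verifiable reward. This is a sketch of the classical $λ$-return recursion only; it does not reproduce the paper's critic-free TD approximation from token log-probabilities.

```python
# Illustrative sketch of lambda-weighted per-token credit with a terminal
# (verifiable) reward -- a generic eligibility-trace recursion, not
# GRPO-lambda's exact critic-free TD approximation.
import numpy as np

def lambda_credit(T, seq_reward, gamma=1.0, lam=0.9):
    """Per-token credit w_t = sum_{k>=t} (gamma*lam)^(k-t) * delta_k, where
    the only nonzero 'TD error' is the terminal reward (no critic)."""
    delta = np.zeros(T)
    delta[-1] = seq_reward            # verifiable reward lands on the last token
    w = np.zeros(T)
    acc = 0.0
    for t in reversed(range(T)):      # backward recursion of the lambda-return
        acc = delta[t] + gamma * lam * acc
        w[t] = acc
    return w

# tokens near the reward receive more credit than early ones
print(lambda_credit(T=6, seq_reward=1.0))  # -> [0.59, 0.66, 0.73, 0.81, 0.9, 1.0]
```

With $λ = 1$ (and $γ = 1$), every token receives the full reward, recovering GRPO's uniform sequence-level credit; $λ < 1$ concentrates credit near the reward.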
Optimizing Input Data Collection for Ranking and Selection
We study a ranking and selection (R&S) problem when all solutions share common parametric Bayesian input models updated with the data collected from multiple independent data-generating sources. Our objective is to identify the best system by designing a sequential sampling algorithm that collects input and simulation data given a budget. We adopt the most probable best (MPB) as the estimator of the optimum and show that its posterior probability of optimality converges to one at an exponential rate as the sampling budget increases. Assuming that the input parameters belong to a finite set, we characterize the $\epsilon$-optimal static sampling ratios for input and simulation data that maximize the convergence rate. Using these ratios as guidance, we propose the optimal sampling algorithm for R&S (OSAR) that achieves the $\epsilon$-optimal ratios almost surely in the limit. We further extend OSAR by adopting the kernel ridge regression to improve the simulation output mean prediction. This not only improves OSAR's finite-sample performance, but also lets us tackle the case where the input parameters lie in a continuous space with a strong consistency guarantee for finding the optimum. We numerically demonstrate that OSAR outperforms a state-of-the-art competitor.
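A small sketch of the most probable best (MPB) estimator over a finite parameter set, assuming posterior weights and estimated conditional means are already available; all numbers below are made up.

```python
# Sketch of the "most probable best" (MPB) estimator over a finite parameter
# set: the solution most likely to be optimal under the posterior on the
# input parameter. Posterior weights and conditional means are made up.
import numpy as np

post = np.array([0.2, 0.5, 0.3])        # posterior over Theta = {th1, th2, th3}
# estimated mean performance mu_hat[i, j] of solution i under parameter th_j
mu_hat = np.array([[1.0, 2.5, 1.8],
                   [1.4, 2.0, 2.2],
                   [2.1, 2.3, 1.5]])

best_per_theta = mu_hat.argmin(axis=0)  # conditional best solution for each theta
# P(solution i is best) = posterior mass of thetas where i is conditional best
p_best = np.array([post[best_per_theta == i].sum() for i in range(mu_hat.shape[0])])
mpb = p_best.argmax()
print(p_best, mpb)                      # [0.2 0.5 0.3] -> MPB is solution 1
```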
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
Improving the Convergence Rates of Forward Gradient Descent with Repeated Sampling
Dexheimer, Niklas, Schmidt-Hieber, Johannes
Forward gradient descent (FGD) has been proposed as a biologically more plausible alternative to gradient descent, as it can be computed without a backward pass. For the linear model with $d$ parameters, previous work has shown that the prediction error of FGD converges slower than that of stochastic gradient descent (SGD) by a factor of $d$. In this paper we show that by computing $\ell$ FGD steps based on each training sample, this suboptimality factor becomes $d/(\ell \wedge d)$, and thus the suboptimality of the rate disappears if $\ell \gtrsim d$. We also show that FGD with repeated sampling can adapt to low-dimensional structure in the input distribution. The main mathematical challenge lies in controlling the dependencies arising from the repeated sampling process.
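A minimal numpy sketch of FGD with repeated sampling on the linear model: each drawn sample is reused for $\ell$ forward-gradient steps, where the forward gradient $(\nabla f \cdot v)v$ with $v \sim N(0, I)$ is an unbiased estimate of the gradient. The step size and problem data are illustrative.

```python
# Sketch of forward gradient descent (FGD) with repeated sampling on a linear
# model: each sample is reused for ell forward-gradient steps. Step size and
# data are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
d, n, ell, eta = 10, 2000, 10, 0.003
theta_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ theta_star                            # noiseless linear model

theta = np.zeros(d)
for i in range(n):
    x_i, y_i = X[i], y[i]
    for _ in range(ell):                      # ell FGD steps on the same sample
        grad = (x_i @ theta - y_i) * x_i      # gradient of 0.5 * (x.theta - y)^2
        v = rng.normal(size=d)                # random direction
        fgd = (grad @ v) * v                  # forward gradient: E[fgd] = grad
        theta -= eta * fgd
print(np.linalg.norm(theta - theta_star))     # small: near-recovery of theta_star
```

Here $\ell = d = 10$, the regime where the abstract's suboptimality factor $d/(\ell \wedge d)$ equals one.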
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (5 more...)